Final Report on the project AFOSR 78-3495


In the research project on the theory and applications of Stochastic and
Differential Games under the grants AFOSR 78-3495 and 78-3495B, the principal
investigator Professor T.E.S. Raghavan, his former student Professor J. Filar
(currently at the Johns Hopkins University), and Professor T. Parthasarathy
(currently at the Indian Statistical Institute) considered several problems in
the area of Stochastic Games and Differential Games. We mention a few of the
most notable.

Problem 1 (AFOSR 78-3495, 1978-79): Let players I and II alternate among finitely
many matrix games with the law of motion (namely, which matrix game is to be
played next, knowing the current choices of the players and the current matrix
played) completely determined by the current matrix and the choice of one player,
say player II (the minimizer). If the payoff is accumulated with a discount rate,
or if the payoff is the long-run limiting average per play, can we assert the
order field property? Say, if we know that the data of the game has rational
entries, can we assert that the value and some good stationary strategies have
rational components?

This is asserted affirmatively in [1].

Problem 2: Can the result of Problem 1 be extended to non-cooperative bimatrix
stochastic games for stationary Nash equilibria?

This is settled affirmatively in [1].
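To illustrate what the order field property means in the simplest possible case, the sketch below solves a single 2x2 zero-sum matrix game in exact rational arithmetic. It is only an illustration for one matrix game, not the stochastic-game argument of [1]; the function name solve_2x2 and the sample payoffs are assumptions made for this example. The point is that the value and the optimal mixed strategies are obtained using only addition, subtraction, multiplication and division, so rational data give rational answers.

```python
from fractions import Fraction


def solve_2x2(a, b, c, d):
    """Solve the zero-sum matrix game [[a, b], [c, d]] (hypothetical helper).

    The row player maximizes and the column player minimizes.
    Returns (value, p, q), where p is the probability of row 0 and
    q is the probability of column 0 in a pair of optimal strategies.
    """
    rows = [[a, b], [c, d]]
    maximin = max(min(r) for r in rows)      # best payoff a pure row guarantees
    minimax = min(max(a, c), max(b, d))      # best cap a pure column guarantees
    if maximin == minimax:
        # Saddle point: pure strategies are optimal.
        i = 0 if min(rows[0]) == maximin else 1
        j = 0 if max(a, c) == minimax else 1
        return maximin, Fraction(1 - i), Fraction(1 - j)
    # No saddle point: both players mix; only field operations are used.
    denom = a - b - c + d
    value = (a * d - b * c) / denom
    p = (d - c) / denom
    q = (d - b) / denom
    return value, p, q


# Sample game with rational (here integer) entries.
value, p, q = solve_2x2(Fraction(1), Fraction(0), Fraction(0), Fraction(2))
print(value, p, q)  # prints 2/3 2/3 2/3
```

Using Fraction rather than floating point keeps every intermediate quantity in the field of rationals, which is exactly the property asked about in Problem 1.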

Under the guidance of Professor T.E.S. Raghavan, J.A. Filar, who was supported by
the grant as a research assistant during this period, submitted his Ph.D. thesis
[4] on the same topic. One of the central questions in the proposal was the
following.

Problem 3: For any general Stochastic Game, does there exist a value when the
payoff is taken to be the long-run average income per play?

Several people around the world were attacking the same problem, and in
his Ph.D. thesis Filar obtained the following partial solution.

Theorem: If the decision to terminate is in the hands of one player, say the
maximizer, then the deciding player has an optimal behavior strategy. The other
player has an optimal stationary strategy for cyclic games [6].

From  the  point  of  view  of  applications,  stochastic  games  need  to  be  solved 
efficiently.  During  the  proposal  period  1979-80  the  main  thrust  was  in  finding 
finite  step  algorithms  for  the  Cesaro  average  payoffs  when  the  law  of  motion  is 
completely controlled by one player. The main problem in stochastic games in the
proposal during 1979-81 was the following.

Problem  4:  Find  an  efficient  simplex-like  finite  step  algorithm  to  find  the  value 
of  Stochastic  Games  controlled  by  one  player.  Find  an  algorithm  to  locate  at 
least  one  optimal  stationary  strategy  for  the  two  players. 

This  was  settled  in  the  paper  [3].  During  the  special  one  week  session  for 
Game  Theory  held  at  the  Oberwolfach  Mathematics  Institute,  West  Germany,  Professor 
Raghavan  was  invited  to  present  these  findings. 
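For orientation only, the sketch below shows how the value of a small discounted game, in which player II alone controls the transitions, can be approximated by the classical Shapley value iteration. This is not the finite-step algorithm of [3], which addresses the undiscounted (Cesaro average) payoff; it is a swapped-in textbook technique. The two-state data, the discount factor BETA, and all function names are hypothetical choices made for the example.

```python
def val_2x2(m):
    """Value of the 2x2 zero-sum matrix game m (row player maximizes)."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:               # saddle point in pure strategies
        return maximin
    return (a * d - b * c) / (a - b - c + d)


# Hypothetical data: 2 states, 2 actions per player in each state.
# reward[s][i][j]: payoff to player I in state s under actions (i, j).
reward = [
    [[4.0, 0.0], [1.0, 3.0]],
    [[0.0, 2.0], [5.0, 1.0]],
]
# trans[s][j][t]: probability of moving from state s to state t when
# player II plays j. Player I's action does not appear, so player II
# alone controls the law of motion.
trans = [
    [[0.5, 0.5], [0.0, 1.0]],
    [[1.0, 0.0], [0.3, 0.7]],
]
BETA = 0.9  # discount factor (assumed)


def shapley_step(v):
    """Apply the Shapley operator once to the value vector v."""
    new_v = []
    for s in range(2):
        # Discounted continuation value depends only on player II's action j.
        cont = [BETA * sum(trans[s][j][t] * v[t] for t in range(2))
                for j in range(2)]
        aux = [[reward[s][i][j] + cont[j] for j in range(2)]
               for i in range(2)]
        new_v.append(val_2x2(aux))
    return new_v


def solve(tol=1e-10, max_iter=10_000):
    """Iterate the Shapley operator until successive vectors differ by < tol."""
    v = [0.0, 0.0]
    for _ in range(max_iter):
        new_v = shapley_step(v)
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


values = solve()
print(values)  # approximate discounted values for states 0 and 1
```

Value iteration only converges in the limit, which is one way to see why a finite-step, simplex-like procedure as asked for in Problem 4 is of practical interest.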

In  his  Ph.D.  Thesis,  Filar  using  the  results  of  [1]  solved  affirmatively 
the  following  problem. 

Problem 5: If only one player controls the law of motion at each state, but the
controlling player could be different in different states, does there exist a
value in stationary strategies? Does the order field property hold?

The results have appeared in [5].

Unexpectedly, while trying to solve Problem 4, several new and somewhat
curious results were obtained [3]. As a sample we mention two theorems.

Theorem: Let the finitely many pure stationary strategies for player I and player
II be numbered as 1, 2, ..., m and 1, 2, ..., n. Let $\phi_{ij}(s)$ be the expected
Cesaro average income if pure stationary strategy i is used by player I and j by
player II, when the game starts at s. Let v(s) = value of the matrix game
$(\phi_{ij}(s))$.

If the same player controls the law of motion at each state, then v(s)
coincides with the value of the stochastic game.

Remark: Such a theorem fails to hold in general if the controlling player can
be different for different states.
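For reference, the two quantities in the theorem above can be written out as follows. This is a sketch in standard notation, not taken from [3]: the immediate payoff function $r$, the state and action sequence $(s_k, a_k, b_k)$, and the simplices $\Delta_m$, $\Delta_n$ of mixed strategies are symbols assumed here for illustration. Under a pair of pure stationary strategies the states form a finite Markov chain, so the Cesaro limit exists.

```latex
\[
\phi_{ij}(s) \;=\; \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N}
  E_{ij}\bigl[\, r(s_k, a_k, b_k) \,\big|\, s_1 = s \,\bigr],
\qquad
v(s) \;=\; \max_{x \in \Delta_m} \, \min_{y \in \Delta_n}
  \sum_{i=1}^{m} \sum_{j=1}^{n} x_i \, \phi_{ij}(s) \, y_j .
\]
```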

Theorem: The matrix games $(\phi_{ij}(s))$ all have a common optimal strategy for the
non-controlling player. This strategy can be used to construct a good stationary
strategy for the stochastic game.

During the period 1981-82, the main problem was the following.

Problem 6: Find a finite step algorithm for stochastic games where the controller
can be different for different states.

The lack of any connection to both dynamic programming and matrix games was the
key difficulty in this case. While visiting the Game Theory Center at the
Katholic University in the Netherlands, the principal investigator mentioned this
problem to Professor Tijs and his student Mr. O.J. Vrieze. The successful finite
step procedure based on the joint research work with the Dutch game theorists is
reported in [7].

Meanwhile these results have been drawing the attention of game theoretic
researchers all over the world. The algorithms proposed in [1] and [3] are further
sharpened in the papers [8], [9], [10] and [11]. For the non-zero sum case, the
early development of these ideas has triggered interest in [12].

All along, the main thrust was in the solution of stochastic games, even
though some problems in Differential Games were solved as part of the proposal.
The main reason is that without understanding the discrete case of multimove games
it is not possible to attack continuous versions. However, a specific differential
game based on the toy "Etch A Sketch" was proposed as a model of continuous
conflict. A complete solution to this problem in the spirit of Friedman and Fleming
was furnished in an interim report submitted to the AFOSR. Since Differential
Games are essentially continuous versions of stochastic games, we concentrated
more on the discrete version of them to avoid technical difficulties of measure
theory. The following problem was raised in the proposal during 1981-82.

Problem 7: Given a differential game in which the dynamics of motion are
controlled by only one player's control, but the payoff is determined jointly by
both players, and given that the non-controlling player has only finitely many
controls to choose from, does there exist a value in relaxed controls? If so, how
can the value be found?

This  problem  under  certain  regularity  conditions  is  solved  in  [2]. 

The problems mentioned above and their solutions are reported in the following
papers.

[1] T.E.S. Raghavan (with T. Parthasarathy), An orderfield property for stochastic
games when one player controls transition probabilities, J. Optimization
Theory and Applications, vol. 33, No. 3, 1981, 375-392.

[2] T.E.S. Raghavan (with J.A. Filar), An algorithm for solving S-games and
differential S-games, accepted for publication, to appear in SIAM J. Control
and Optimization.

[3] T.E.S. Raghavan (with J.A. Filar), An algorithm for solving undiscounted
stochastic games in which one player controls transitions, invited address at
the Game Theory Session, Oberwolfach Mathematics Institute, West Germany (1980).

[4] J.A. Filar, Algorithms for solving some undiscounted stochastic games, Ph.D.
Thesis submitted to the University of Illinois at Chicago Circle (1979).

[5] J.A. Filar, Orderfield property for stochastic games when the player who
controls transitions changes from state to state, J. Optimization Theory and
Applications, vol. 34, No. 4, 1981, 503-515.

[6] J.A. Filar, A single loop stochastic game which one player can terminate,
Opsearch, vol. 18, No. 4, 1981, 185-203.

[7] T.E.S. Raghavan (with O.J. Vrieze, S.H. Tijs and J.A. Filar), A finite
algorithm for switching control stochastic game, Report 8130, Dec. 1981,
Mathematics Institute, Katholic University, The Netherlands.

The research carried out under the proposal triggered international interest in
these problems, and the following contributions in Stochastic Games
resulted.

[8] L.C.M. Kallenberg, Linear Programming and Markovian Decision Problems,
Ph.D. Thesis, University of Leiden, The Netherlands (1980).

[9] O.J. Vrieze, Linear programming and undiscounted stochastic games in which
one player controls transitions, OR Spectrum, vol. 3, 1981, 29-35.

[10] A. Hordijk & L.C.M. Kallenberg, Linear Programming and Markov Games 1,
Report No. 81-04, University of Leiden, 1981.

[11] A. Hordijk & L.C.M. Kallenberg, Linear Programming and Markov Games 2,
Report No. 81-06.

[12] U.G. Rothblum, Solving stopping stochastic games by maximizing a linear
function subject to quadratic constraints, Game Theory and Related Topics,
O. Moeschlin & D. Pallaschke (editors), North Holland, Amsterdam (1979), 103-105.

The research assistantship under the grant was fruitfully used to provide
summer support for two graduate students. While helping in the project, a
graduate student wrote the following Ph.D. thesis.

[13] R. Bapat, On Diagonal Products of Doubly Stochastic Matrices, Ph.D. Thesis
submitted to the University of Illinois at Chicago (1981).

The significant contribution to this important aspect of multimove games
resulted in our Ph.D. student Mr. J.A. Filar getting an assistant professorship
at the Johns Hopkins University. Mr. Bapat, after finishing his thesis, joined
Northern Illinois University at DeKalb.

Professor Tijs from the Netherlands visited our campus to do some collaborative
research on switching control stochastic games. After some initial
success  Professor  Raghavan  visited  the  Katholic  University,  The  Netherlands 
to  do  joint  research  with  Professor  Tijs  and  his  student  Mr.  Vrieze.  Such 
contacts  and  successful  progress  would  not  have  been  possible  without  the  support 
of  AFOSR. 

In summary, the principal investigator has successfully completed the main
target of solving for good strategies of stochastic games. The major step is
from existence to computation. This is achieved for the important class of games
controlled by at most one player at each state. Many counterexamples exist for
other cases to show that a general game is not solvable in stationary strategies.